ABSTRACT

Broadcast in a communication network is the delivery of copies of messages to all nodes. A
broadcast protocol is reliable if all messages reach all nodes in finite time, in the correct order
and with no duplicates. The present paper presents an efficient reliable broadcast protocol

1. INTRODUCTION

Broadcast multipoint communication is the delivery of copies of messages to all nodes in a
communication  network.  In  a  network  with  mobile  subscribers,  for  example,  the  location  and 
connectivity  to  the  network  of  such  subscribers  may  change  frequently  and  this  information 
must  be  broadcast  to  all  nodes  in  the  network,  so  that  the  corresponding  directory  list  entry 
can be updated. Broadcast messages are used in many other situations, like locating subscribers
or services whose current location is unknown (possibly because of security reasons),
updating  distributed  data  bases  or  transmitting  information  and  commands  to  all  users 
connected  to  the  communication  network. 

There  are  certain  basic  properties  that  a  good  broadcast  algorithm  must  have  and  the 
most  important  are:  a)  reliability,  b)  low  communication  cost,  c)  low  delay,  d)  low  memory 
requirements.  Reliability  means  that  every  message  must  indeed  reach  each  node,  duplicates 
should  be  recognizable  upon  arrival  at  a  node  and  only  one  copy  accepted,  and  messages 
should arrive in the same order as transmitted. Communication cost is the amount of communication
necessary to achieve the broadcast and consists of, first, the number of messages
carried  by  the  network  per  broadcast  message  (broadcast  communication  cost),  second,  the 
number  of  control  messages  necessary  to  establish  the  broadcast  paths  (control  communication 
cost),  and,  third,  the  overhead  carried  by  each  message  (overhead  cost).  Low  delay  and  a 
small  buffer  memory  are  basic  requirements  for  any  communication  algorithm,  and  broadcasts 
are  no  exception. 

The  definition  of  reliability  indicated  above  requires  some  discussion,  because  in  some 
applications  not  all  the  requirements  are  necessary.  For  example,  broadcast  of  topological 
information in the new ARPANET routing algorithm [4] does not require order preservation
and does allow duplicates. On the other hand, these properties are important when the
information to be broadcast may be packetized and needs reassembly at the receiving nodes as
well as in applications where the broadcast consists of series of commands whose order and
nonduplication is important. In the present paper we achieve reliability in the sense defined
above.

The  broadcast  communication  cost  is  minimized  if  the  algorithm  uses  spanning  trees,  but 
normally  there  is  need  for  a  large  control  communication  cost  in  order  to  establish  and 
maintain  these  trees.  However,  the  control  cost  can  be  reduced  considerably  provided  that  the 
routing  mechanism  in  the  network  constructs  routing  paths  that  form  directed  trees  towards 
each  destination,  in  which  case  these  trees  can  be  used  in  the  reverse  direction  for  broadcast 
purposes.  This  general  idea  is  presented  in  [1],  but  the  authors  show  that  the  proposed 
algorithms named reverse path forwarding and extended reverse path forwarding are not
reliable when the routing algorithm is dynamic, since in this case nodes may never receive
certain  messages,  duplicates  may  be  received  and  accepted  at  nodes,  and  the  order  of  arriving 
messages may not be preserved. As said before, in order to be efficient, the above mentioned
algorithms require that the routing paths to each destination are directed trees. An adaptive
routing  algorithm  that  maintains  at  all  times  spanning  directed  trees  rooted  at  the  destination 
has  been  proposed  in  [2]  and  throughout  the  present  paper  we  assume  that  the  protocol  of  [2] 
is  the  underlying  routing  algorithm  in  the  network.  However,  for  the  reasons  stated  before, 
namely the fact that the routing paths are dynamic, the broadcast algorithm of [1] is unreliable
even  if  applied  to  the  routing  procedures  of  [2]. 

The  purpose  of  the  present  paper  is  to  propose  and  validate  an  algorithm  whose  main 
property  is  that  the  broadcast  propagating  on  the  tree  provided  by  the  routing  protocol  of  [2] 
is  reliable.  It  is  convenient  for  the  purpose  of  our  discussion  to  separate  the  property  of 
reliability  into  two  parts:  completeness  means  that  each  node  accepts  broadcast  messages  in 
the  order  released  by  their  origin  node,  without  duplicates  or  messages  missing,  while 
finiteness is the property that each broadcast message is indeed accepted at each node in finite
time  after  its  release.  As  shown  by  the  authors,  the  algorithms  of  [1]  are  neither  complete  nor 
finite. In the algorithm of the present paper, completeness is achieved by requiring nodes to
store broadcast messages in the memory for a given period of time and by introducing counter
numbers at the nodes. Finiteness is obtained by attaching a certain impeding mechanism to
the routing protocol. We may mention here that a broadcast algorithm can be easily made
reliable  if  one  allows  infinite  memory,  unbounded  counter  numbers  and  infinite  overhead  in 
the  broadcast  messages.  The  properties  that  make  our  algorithm  tractable  are:  bounded 
memory,  bounded  counter  numbers,  no  overhead  carried  by  broadcast  messages  (in  form  of 
counter  numbers  or  any  other  kind)  and  the  fact  that  the  impeding  mechanism  is  not  activated 
most  of  the  time. 

In  the  rest  of  the  paper  we  proceed  as  follows:  Sec.  2.1  contains  a  brief  description  of  the 
routing  protocol  of  [2].  Sec.  2.2  and  2.3  build  the  reliable  broadcast  protocol  step  by  step, 
while  its  final  form  and  main  properties  are  given  in  Sec.  3.  The  proofs  of  the  main  theorems 
are  included  in  the  Appendix. 

2.  THE  BROADCAST  PROTOCOL 
2.1  The  Routing  Protocol 

The  underlying  routing  protocol  considered  in  this  paper  is  The  Basic  Protocol  of  [2].  In 
summary,  this  protocol  proceeds  in  updating  cycles  triggered  and  terminating  at  the  destination 
node  named  SINK.  An  updating  cycle  consists  of  two  phases:  a)  control  messages  propagate 
uptree  from  SINK  to  the  leaves  of  the  current  tree  and  each  node  i  performs  this  phase 
whenever it receives a control message MSG from its current preferred neighbor p_i ; b)
control messages propagate downtree, while new preferred neighbors are selected and this
phase is performed at node i upon detecting receipt of MSG from all neighbors. Our basic
assumption  is  that  all  messages  sent  on  a  link  arrive  in  arbitrary  but  finite  time  after  their 
transmission,  with  no  errors  and  in  the  correct  order  (FIFO).  Observe  that  this  does  not 
preclude  channel  errors  provided  there  is  an  acknowledgement  and  retransmission  protocol  on 
the  link.  Under  these  conditions,  the  routing  protocol  maintains  at  all  times  a  directed 
spanning  tree  rooted  at  SINK. 

Before specifying the routing protocol we indicate several notations used in all algorithms
in this paper. Subscript i indicates variables at node i and corresponding variables without
subscript indicate variables in the received message. Whenever a message arrives from a
certain neighbor it is first stamped with the identity of that neighbor, so that a control
message, e.g. MSG(a), received at i from neighbor l will be seen by the processor at i as
MSG(l,a). All variables and control messages of the algorithm are indexed by SINK and the
protocol is performed independently for every possible source of broadcast messages in the
network; in order to simplify notation, we suppress the index SINK. We also write in short
"For MSG(l,a)" instead of "Whenever MSG(a) is received from neighbor l, perform" and
denote by G_i the set of neighbors of i.


We  next  indicate  the  algorithm  at  each  node  that  implements  the  routing  protocol  of  [2]. 
In  the  following  sections  we  shall  need  to  identify  the  updating  cycles  and  it  is  convenient  to 
attach  already  at  this  point  a  counter  number  a  to  each  cycle.  The  cycle  number  will  be 
incremented  by  the  SINK  when  it  starts  a  new  cycle  and  will  be  carried  by  the  control 
messages  MSG  belonging  to  the  given  cycle.  For  the  time  being  the  counter  number  will  be 
unbounded, but later we shall show that a binary variable is sufficient.


Variables at Node i

a_i     = counter number of the current updating cycle as known by i
          (values 0,1,2,...) (not used in this and the next algorithms, but
          introduced here for later convenience)

p_i     = current preferred neighbor at node i

N_i(l)  = 1 if MSG corresponding to current cycle has been already received
          from neighbor l, = 0 otherwise; ∀l ∈ G_i ; (initialized to 0).

Routing Algorithm for Node i (RA)

1. For MSG(l,a)
2.    N_i(l) ← 1
3.    if l = p_i then: a_i ← a ; send MSG(a_i) to all l' ∈ G_i except p_i
4.    if N_i(l') = 1 holds ∀l' ∈ G_i then: send MSG(a_i) to p_i ;
         select new p_i ; set N_i(l') ← 0 ∀l' ∈ G_i

We have deliberately suppressed from the algorithm of [2] all variables that are not
directly relevant to the broadcast and have not explicitly indicated the procedure for selecting
the new p_i because it is not important for our purpose, except for the property that it maintains
at all times a directed spanning tree rooted at SINK. For simplicity, p_i will be called the
father of i. The algorithm is indicated for a given SINK that is not specified explicitly and
that becomes the source of the broadcast messages. The SINK performs the following
algorithm (lines are numbered to match equivalent instructions to the Routing Algorithm):

3. Start new cycle by a_SINK ← a_SINK + 1 ; send MSG(a_SINK) to all l ∈ G_SINK
   (Note: <3> can be performed only after <4> of the previous cycle
   has been performed).

1. For MSG(l,a)
2.    N_SINK(l) ← 1
4.    if N_SINK(l') = 1 holds ∀l' ∈ G_SINK then: cycle a completed;
         set N_SINK(l') ← 0 ∀l' ∈ G_SINK
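
The updating cycle described by the two algorithms above can be traced with a small simulation. The following is only an illustrative sketch under stated assumptions: a single global FIFO event queue stands in for the links (this preserves per-link FIFO order), every node keeps its father unchanged in <4> because the selection rule of [2] is not reproduced here, and the names run_cycle, neighbors and father are hypothetical.

```python
from collections import deque


def run_cycle(neighbors, father, sink):
    """Run one updating cycle; return (number of MSG sent, cycle completed)."""
    # received[i] holds the neighbors l for which N_i(l) = 1.
    received = {i: set() for i in neighbors}
    # <3> at SINK: start the cycle by sending MSG to every neighbor.
    queue = deque((sink, l) for l in neighbors[sink])
    sent = len(queue)
    done = False
    while queue:
        src, dst = queue.popleft()
        received[dst].add(src)                       # <2>
        if dst == sink:
            # <4> at SINK: cycle ends once MSG arrived from all neighbors.
            done = received[sink] == set(neighbors[sink])
            continue
        if src == father[dst]:                       # <3>: forward uptree
            for l in neighbors[dst]:
                if l != father[dst]:
                    queue.append((dst, l))
                    sent += 1
        if received[dst] == set(neighbors[dst]):     # <4>: report downtree
            queue.append((dst, father[dst]))
            sent += 1
            received[dst] = set()                    # assumption: father kept
    return sent, done


# Hypothetical 4-node network with E = 4 links.
neighbors = {
    "SINK": ["a", "b"],
    "a": ["SINK", "b", "c"],
    "b": ["SINK", "a"],
    "c": ["a"],
}
father = {"a": "SINK", "b": "SINK", "c": "a"}
sent, done = run_cycle(neighbors, father, "SINK")
print(sent, done)
```

In this sketch every node sends exactly one MSG to each of its neighbors, so the count equals twice the number of links.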

In principle, the routing tree can be used for broadcast purposes as follows: a node i accepts
only broadcast messages received from its father p_i and forwards them to all nodes k whose
father is i. Observe that we distinguish between receiving a broadcast message and accepting
it. In general, a broadcast message received at a node may be either accepted or rejected,
depending on the specific algorithm.
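
As an illustration of this idea on a static tree, the sketch below inverts the father pointers and delivers one message from SINK. It is a minimal example, not the protocol itself: the tree is assumed fixed, links are assumed instantaneous, and the names broadcast_on_tree and father are hypothetical.

```python
from collections import deque


def broadcast_on_tree(father, sink):
    """Deliver one message from `sink` along the reversed father pointers.

    `father` maps every non-sink node i to its preferred neighbor p_i.
    Returns (order in which nodes accept the message, copies sent).
    """
    # Invert the father pointers: sons[k] lists the nodes whose father is k.
    sons = {}
    for node, p in father.items():
        sons.setdefault(p, []).append(node)

    accepted = [sink]
    copies_sent = 0
    queue = deque([sink])
    while queue:
        k = queue.popleft()
        for i in sorted(sons.get(k, [])):
            copies_sent += 1      # one copy per tree link
            accepted.append(i)    # i accepts: the copy came from its father
            queue.append(i)
    return accepted, copies_sent


father = {"a": "SINK", "b": "SINK", "c": "a", "d": "a"}
order, copies = broadcast_on_tree(father, "SINK")
print(order, copies)
```

On a tree with N nodes this sends N-1 copies, one per non-SINK node.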


When a node i changes its father p_i (line <4> in the Routing Algorithm), it sends two special
messages: DCL (declare) to the new father and CNCL (cancel) to the old father. Each node i
will have a binary variable z_i(k) for each neighbor k that will take on the value 1 if i thinks
that p_k = i and 0 otherwise. Receipt of DCL at node k from i shows that at the time DCL was
sent, node i selected k as p_i, so that z_k(i) is set to 1. The nodes i for which z_k(i) = 1 are
called sons of k. Observe that because of link delays, if i is a son of k it does not mean that at
the same time k is the father of i. We can now write in our notation the combination of the
above routing algorithm and the Extended Reverse Path Forwarding (ERPF) Broadcast
Algorithm of [1], where B denotes a broadcast message:

Variables at Node i

Same as in RA, and in addition

z_i(l)  = 1 if l is son of i, = 0 otherwise; ∀l ∈ G_i ; (initialized to 0).

ERPF Broadcast Algorithm

1. For MSG(l,a)
2.    N_i(l) ← 1
3.    if l = p_i then: a_i ← a ; send MSG(a_i) to all l' ∈ G_i except p_i
4.    if N_i(l') = 1 holds ∀l' ∈ G_i then:
4a.      select new p_i ;
4b.      if new p_i ≠ old p_i then: send DCL(a_i) to new p_i
            and CNCL to old p_i ;
4c.      send MSG(a_i) to old p_i ; set N_i(l') ← 0 ∀l' ∈ G_i
5. For CNCL(l) set z_i(l) ← 0
6. For DCL(l,a) set z_i(l) ← 1
7. For B(l)
8.    if l = p_i then: accept B; send copy of B to all l' for which z_i(l') = 1

Notes: Line <8> means that if l = p_i, then B is accepted, otherwise it is rejected. Recall
that a is not used in the algorithm, but is included in MSG and DCL for later convenience.

2.2  Completeness 

The  above  broadcast  protocol  is  noncomplete  and  nonfinite.  The  purpose  of  this  section 
is  to  show  that  completeness  can  be  achieved  by  using  memory  and  counter  numbers  at  the 
nodes.  We  achieve  our  goal  without  requiring  broadcast  messages  to  carry  any  counter 
numbers,  so  that  the  algorithm  has  no  overhead  cost.  For  purposes  of  illustration,  it  is  best  to 
impose  for  the  time  being  no  bounds  on  the  memory  or  on  the  counters  and  also  to  describe 
the  protocols  as  if  completeness  was  already  proved.  After  indicating  the  formal  algorithm  we 
shall  show  that  it  is  indeed  complete  and  in  the  following  sections  we  shall  introduce  features 
that  will  make  the  memory  and  the  counters  finite. 

We require each node i to have a LIST_i where every accepted broadcast message is stored
in the received order and also to keep a counter IC_i, counting the accepted messages (recall
that all variables are indexed by SINK). Completeness of the broadcast protocol means that
for any value of IC_i, the list LIST_i contains all messages sent by the source SINK up to and
including counter number IC_i, with no duplicates and in the correct order. In other words if
IC_i^B denotes the value of IC_i after broadcast message B has been accepted at node i, we have
IC_i^B = IC_SINK^B for all B and all i. In the algorithm we also require that every DCL message
sent by node k will have the format DCL(a,IC) where IC = IC_k at the time DCL is sent. In
this way when a node i receives DCL from k, it will have updated information about the "state
of knowledge", denoted by IC_i(k), of its new son k. Only broadcast messages B with
IC_i^B > IC_i(k) need to be sent by i to k.


The formal algorithm is now

Variables and Buffers at Node i

Same as in ERPF, and in addition

LIST_i  = buffer in which all accepted broadcast messages are stored in the received
          order (infinite storage) (initially empty)

IC_i    = counts accepted broadcast messages (values 0,1,2,...) (initialized to 0)

IC_i(l) = value of IC_l as presently known by i, ∀l ∈ G_i (values 0,1,2,...)
          (initialized to 0).

The Complete Routing-Broadcast Algorithm (CRB) for node i

1. For MSG(l,a)
2.    N_i(l) ← 1
3.    if l = p_i then: a_i ← a ; send MSG(a_i) to all l' ∈ G_i except p_i
4.    if N_i(l') = 1 holds ∀l' ∈ G_i then:
4a.      select new p_i
4b.      if new p_i ≠ old p_i then send DCL(a_i,IC_i) to new p_i and
            CNCL to old p_i
4c.      send MSG(a_i) to old p_i ; set N_i(l') ← 0 ∀l' ∈ G_i
5. For CNCL(l) set z_i(l) ← 0
6. For DCL(l,a,IC)
      set z_i(l) ← 1
6a.   if IC < IC_i then: send to l contents of LIST_i from IC+1 to IC_i
         while incrementing IC_i(l) up to IC_i
6b.   else IC_i(l) ← IC
7. For B(l)
7a.   if l = p_i then: IC_i ← IC_i + 1 ; include B in LIST_i ;
7b.      ∀j ∈ G_i for which z_i(j) = 1 and IC_i(j) < IC_i do
7c.         send B to j ; IC_i(j) ← IC_i(j) + 1
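
To make the role of the counters concrete, the following sketch models only the broadcast-related handlers (DCL, CNCL and B) of a node. It is an illustration under assumptions rather than the full protocol: routing cycles are omitted, messages are delivered by direct function calls, and the class and method names (CrbNode, on_dcl, on_cncl, on_broadcast) are hypothetical.

```python
class CrbNode:
    """Broadcast-related state of one node in a CRB-style sketch."""

    def __init__(self, name):
        self.name = name
        self.father = None   # p_i
        self.msgs = []       # LIST_i, in accepted order
        self.son = {}        # z_i(l): True if l is believed to be a son
        self.known = {}      # IC_i(l): counter of l as known at this node

    @property
    def ic(self):
        # IC_i counts accepted messages, i.e. the length of LIST_i.
        return len(self.msgs)

    def on_cncl(self, l):
        self.son[l] = False

    def on_dcl(self, l, ic):
        """Neighbor l declares itself a son and reports IC_l = ic.
        Returns the stored messages that must be forwarded to l."""
        self.son[l] = True
        backlog = self.msgs[ic:] if ic < self.ic else []
        # After the backlog is sent IC_i(l) equals IC_i; otherwise it is ic.
        self.known[l] = max(ic, self.ic)
        return backlog

    def on_broadcast(self, l, b):
        """Accept b only if it comes from the father; return {son: b}."""
        if l != self.father:
            return {}            # received but rejected
        self.msgs.append(b)
        out = {}
        for j, is_son in self.son.items():
            if is_son and self.known[j] < self.ic:
                out[j] = b
                self.known[j] += 1
        return out


# Node j has accepted B1..B3; node i has accepted only B1 so far.
j = CrbNode("j")
j.father = "SINK"
for b in ["B1", "B2", "B3"]:
    j.on_broadcast("SINK", b)
i = CrbNode("i")
i.father = "SINK"
i.on_broadcast("SINK", "B1")

# i selects j as its new father and declares itself with IC = 1.
i.father = "j"
for b in j.on_dcl("i", i.ic):
    i.on_broadcast("j", b)
print(i.msgs)
```

In this trace the new son receives exactly the messages it is missing, in order, without the messages themselves carrying any counter.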

The  proof  that  the  CRB  protocol  is  indeed  complete  appears  in  Appendix  A.  Here  we 
only mention that the important property leading to completeness is the statement of Lemma
A1, that will be called the session property. Broadcast protocols associated with other routing
algorithms can be made to have this property, but several additions to the algorithm are
necessary. It is a special feature of the routing protocol of [2] that the session condition holds
with no extra instructions. Also observe that as will be seen in Lemma A2 and Theorem A1,
completeness  is  achieved  without  requiring  messages  to  carry  their  counter  number. 

2.3  Finiteness 

Completeness  means  that  broadcast  messages  are  accepted  at  nodes  in  the  correct  order 
and with no duplicates or messages missing. However, it does not ensure that all messages are
indeed  accepted  at  all  nodes.  The  following  scenario  shows  that,  since  we  allow  arbitrary 
propagation  time  for  messages  on  each  link,  there  may  be  a  situation  in  the  CRB  algorithm 
where  a  node  i  accepts  no  messages  from  a  certain  time  on. 

Consider Figure 1a), where <3>a denotes execution of line <3> of cycle a in CRB.
Suppose that p_i = j between <4>a and <4>(a + 1), while p_i ≠ j holds outside this interval.
Then upon execution of <3>a and <4>a, node i sends MSG(a) and DCL(a,IC) respectively
to j. If the propagation time of DCL(a,IC) is long enough, it may happen that node j
performs <4>a, cycle a is completed at SINK and node j performs <3>(a + 1) before
receipt of DCL(a,IC) at j. In this case, node i may perform <3>(a + 1) and <4>(a + 1)
before the time it receives a broadcast message B and then p_i ≠ j so that B is not accepted.
This scenario can be repeated indefinitely, so that B and the broadcast messages following it
keep arriving at node i but are never accepted.

In order to correct the situation and achieve finiteness, we introduce an "Impeding
Mechanism" in the CRB algorithm. Control messages MSG(a) sent from j to i will carry an
additional variable z = z_j(i). Any control message MSG(a,z) with a = a_i + 1, z = 0 received
from j = p_i will be ignored (see Fig. 1b). If node j receives DCL(a,IC) with a < a_j, in which
case by Lemma A3 we have a = a_j - 1, node j retransmits MSG(a_j,z) with z = 1. Thus node i
postpones execution of <3> until it receives acknowledgement from j = p_i, in the form of
MSG(a_j, z = 1), that the last DCL message has been received at j. In this way we at least
guarantee that all broadcast messages sent at the time of receipt of DCL(a,IC) (line <6a> in
CRB or <6a><6b> in RRB) will be accepted at node i.
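
The two decisions that make up this mechanism can be summarized in a few lines. The sketch below is illustrative only and assumes that the rule is exactly as described in the paragraph above; the function names on_msg_at_son and on_dcl_at_father, and the string labels they return, are hypothetical.

```python
def on_msg_at_son(l, z, p_i):
    """Decide what node i does with MSG(a, z) arriving from neighbor l."""
    if l != p_i:
        return "record"      # not from the father: only note its receipt
    if z == 1:
        return "advance"     # father knows i is its son: enter next cycle
    return "ignore"          # impeded: father has not yet seen the last DCL


def on_dcl_at_father(a, a_j):
    """Return the MSG that node j retransmits on receiving DCL(a, .), if any."""
    if a == a_j - 1:
        return ("MSG", a_j, 1)   # late DCL: acknowledge with z = 1
    return None                  # DCL arrived in time: nothing to resend


# Node i is in cycle 4 with father j; j already entered cycle 5 before
# receiving the DCL of i, so its first MSG carries z = 0.
first = on_msg_at_son("j", 0, "j")
resend = on_dcl_at_father(4, 5)
second = on_msg_at_son("j", resend[2], "j")
print(first, resend, second)
```

Only the retransmitted message lets node i enter the next cycle, so i stays a son of j long enough to accept the backlog sent in response to its DCL.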

For each broadcast message accepted at a node i, it is convenient at this point to indicate
explicitly the cycle during which it was accepted. To do so we replace LIST_i by a set of
buffers LIST_i(a), a = 1,2,... (for the meantime an infinite number of unbounded buffers) and
all broadcast messages accepted while i was in cycle a are stored in LIST_i(a). By definition, a
node i is in cycle a if a_i = a. Also, counters C_i(a) are used, counting messages accepted
during cycle a. Out of the messages corresponding to cycle a, those that have been accepted
at neighbor l as far as i knows are counted in C_i(l)(a). Consequently, the counter IC is
redefined as the pair IC = (a,C(a)), where IC' < IC'' means that either a' < a'' or a' = a'' and
C'(a') < C''(a'').
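
The ordering on the redefined counters is the usual lexicographic order on pairs. As a small illustration (the function name ic_less is hypothetical), it can be written out and compared with Python's built-in tuple comparison:

```python
def ic_less(ic1, ic2):
    """Return True if IC' < IC'' for pairs IC = (a, C(a))."""
    (a1, c1), (a2, c2) = ic1, ic2
    return a1 < a2 or (a1 == a2 and c1 < c2)


# An earlier cycle always precedes a later one, whatever the message counts.
print(ic_less((1, 5), (2, 0)))
# Within the same cycle the message count decides.
print(ic_less((2, 0), (2, 1)))
print(ic_less((2, 1), (2, 1)))
```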

The  resulting  algorithm  is  given  below  and  the  proof  that  it  is  complete  and  finite  appears 
in  the  Appendix. 

Variables and Buffers at Node i

Same as in ERPF and in addition

LIST_i(a)  = buffers in which all broadcast messages accepted while i is in cycle a are stored
             (a = 0,1,2,...)

C_i(a)     = counts broadcast messages accepted during cycle a
             (a = 0,1,2,...) (C_i(a) = 0,1,2,...)

C_i(l)(a)  = value of C_l(a) as presently known by i,

The RRB Algorithm for node i

1. For MSG(l,a,z)
2.    if l ≠ p_i then: N_i(l) ← 1
3.    if l = p_i and z = 1 then: N_i(l) ← 1 ; a_i ← a_i + 1 ; send MSG(a_i,z_i(l')) to all
         l' ∈ G_i except p_i
4.    if N_i(l') = 1 holds ∀l' ∈ G_i then:
4a.      select new p_i
4b.      if new p_i ≠ old p_i then: send DCL(a_i,C_i(a_i)) to new p_i and
            CNCL to old p_i
4c.      send MSG(a_i) to old p_i ; set N_i(l') ← 0 ∀l' ∈ G_i
5. For CNCL(l,a,C) set z_i(l) ← 0
6. For DCL(l,a,C)
      set z_i(l) ← 1
6a.   if C < C_i(a) then: send to l contents of LIST_i(a) from C+1 to C_i(a)
         while incrementing C_i(l)(a) to C_i(a)
6b.   if a = a_i - 1 then: send MSG(a_i,z_i(l)) to l ;
         send to l contents of LIST_i(a_i) from 1 to C_i(a_i) while
         incrementing C_i(l)(a_i) to C_i(a_i)
6c.   else, if C ≥ C_i(a) then: C_i(l)(a) ← C
7. For B(l)
7a.   if l = p_i then: C_i(a_i) ← C_i(a_i) + 1 ; include B in LIST_i(a_i) ;
7b.      ∀j ∈ G_i for which z_i(j) = 1 and C_i(j)(a_i) < C_i(a_i) do
            send B to j ; C_i(j)(a_i) ← C_i(j)(a_i) + 1


Before  proceeding,  we  note  here  that  the  Impeding  Mechanism  slows  down  the  routing 
algorithm,  but  only  in  extreme  situations.  This  is  because  the  Impeding  Mechanism  is  in  fact 
activated only in the case when DCL(a,C) sent by a node i to j arrives there after node j has
performed <3> of cycle (a + 1). Since such a DCL message is sent by i when it performs
<4> of cycle a, this means that propagation of DCL on link (i,j) takes more time than
propagation  of  the  routing  cycle  a  from  i  all  the  way  to  SINK  plus  propagation  of  cycle 
(a  +  1)  all  the  way  from  SINK  to  node  j.  This  may  indeed  happen  if  we  allow  arbitrary 
delays  on  links,  but  the  chances  are  small. 

3.  THE  RELIABLE  BROADCAST  PROTOCOL 

The  final  form  of  the  broadcast  protocol  will  be  obtained  from  the  RRB  algorithm  after 
making  several  observations. 

a) The broadcast messages accepted by node i while it is in cycle a are exactly those
broadcast messages released by SINK while it is in cycle a (follows from Corollary A1).

b)  If  node  i  is  in  cycle  a,  it  will  never  be  required  to  send  to  neighbors  messages  accepted 
prior  to  cycle  (a  -  1)  and  therefore  it  needs  to  store  only  messages  accepted  during 
the  present  and  the  previous  cycles. 

From a) and b) follows that we can make significant simplifications in RRB. The
variables a_i can be binary; only two lists LIST_i(0) and LIST_i(1) need to be stored; if SINK
is allowed to send no more than M broadcast messages per cycle, those lists can have finite
size M; only counters C_i(0), C_i(l)(0), C_i(1), C_i(l)(1) are needed and all those are
bounded by M; control messages MSG need not carry the variable a. The resulting broadcast
algorithm has the following properties:
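
The bounded storage implied by these simplifications can be sketched as two buffers indexed by a binary cycle variable. This is a minimal illustration under assumptions: only the storage of one node is modelled, the per-neighbor counters are omitted, and the class name BoundedStore and its methods are hypothetical.

```python
class BoundedStore:
    """Two buffers of size at most M, indexed by the binary cycle variable."""

    def __init__(self, m):
        self.m = m
        self.a = 0                 # binary cycle variable a_i
        self.lists = [[], []]      # LIST_i(0) and LIST_i(1)

    def enter_next_cycle(self):
        self.a ^= 1
        # The buffer of the cycle before the previous one is no longer
        # needed, so its slot is reused for the new cycle.
        self.lists[self.a] = []

    def accept(self, b):
        if len(self.lists[self.a]) >= self.m:
            raise OverflowError("more than M broadcast messages in one cycle")
        self.lists[self.a].append(b)

    def current(self):
        return self.lists[self.a]

    def previous(self):
        return self.lists[self.a ^ 1]


store = BoundedStore(m=2)
store.accept("B1")
store.accept("B2")
store.enter_next_cycle()
store.accept("B3")
print(store.previous(), store.current())
```

The total memory held at any time is at most 2M messages, matching the bound stated above.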

Properties of RRB (network has N nodes and E links)

1)  Reliability 

2)  Finite  memory  and  counters 


3)  No  overhead  cost 


4)  Control  communication  cost:  the  routing  protocol  requires  2E  messages  MSG  per  cycle 
whether  broadcast  is  operating  in  the  network  or  not.  Broadcast  requires  no  new 
MSG  messages,  except  in  the  peculiar  situation  described  at  the  end  of  Section  2.3.  In 
addition  we  need  at  most  N  DCL  messages  and  N  CNCL  messages  per  cycle. 

5)  Broadcast  communication  cost:  most  of  the  time  broadcast  messages  propagate  on 
spanning  trees.  The  only  situation  when  two  copies  of  the  same  message  arrive  at  a 
node  (and  one  is  ignored)  is  when  a  broadcast  message  "crosses  paths"  with  a  CNCL 
message.  This  means  that  CNCL  is  sent  by  i  to  j  and  the  broadcast  message  is  sent  by 
j before CNCL has arrived and is received by i after CNCL was sent. The worst case
gives  2(N-l)  messages  in  the  net  per  broadcast  message,  but  in  most  cases  this 
situation  will  not  occur,  especially  if  the  propagation  time  of  CNCL  is  small,  so  that 
the  average  is  very  close  to  (N-l)  copies  per  message,  which  is  the  minimal  broadcast 
communication  cost. 

6) Delay: the routing algorithm tends to find paths with small total weight (sum of link
weights from nodes to SINK). The delay of broadcast message will be small if the
weights are link delays and the traffic is symmetric on links or if the weights of link
(i,j) contain a measure of the delay on link (j,i).
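
The message counts quoted in properties 4) and 5) are easy to evaluate for a given network size. The helper below is a small worked example of that arithmetic only; the function names and the sample values N = 10, E = 15 are hypothetical.

```python
def control_messages_per_cycle(n, e):
    """Upper bounds on control messages per updating cycle (property 4)."""
    return {"MSG": 2 * e, "DCL_max": n, "CNCL_max": n}


def broadcast_copies(n):
    """Copies carried by the network per broadcast message (property 5)."""
    return {"minimum": n - 1, "worst_case": 2 * (n - 1)}


control = control_messages_per_cycle(n=10, e=15)
copies = broadcast_copies(n=10)
print(control)
print(copies)
```

For this sample network a cycle costs 30 MSG messages plus at most 10 DCL and 10 CNCL messages, and each broadcast message costs between 9 and 18 copies.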

Appendix A

Here  we  prove  that  the  CRB  Protocol  of  Section  2.2  is  complete  and  that  the  RRB 
protocol  of  Section  2.3  is  complete  and  finite.  First  we  recall  several  properties  of  the  routing 
protocol (RA) of [2] indicated in Section 2.1 and introduce additional definitions:

a) in each cycle a, the routing protocol requires each node i to send exactly one MSG(a)
   to each neighbor.

b) cycle a starts when SINK sends MSG(a) to all its neighbors (<3> in the algorithm for
   SINK) and ends when SINK receives MSG(a) from all neighbors (line <4>).

c) a node i is said to be in cycle a while a_i = a, i.e. from the time it performs <3> with
   a_i ← a and until it performs <3> with a_i ← a + 1.

d) at the time just before node i performs <3> we have a = a_i + 1, so that a_i always
   increases by 1.

e) whenever we need to indicate the value of a variable, say p_i, at a certain time t we shall
   write p_i[t].

Lemma A1 (Session Property)

Consider the CRB Protocol of Section 2.2. If a broadcast message B is received at time t
at node i from j and it is accepted, then B was sent by j after receiving the last DCL message
sent by i until time t.

Proof

Let τ < t be the time B was sent by j. Since broadcast messages are accepted only from
fathers (see <7a> of CRB) and sent only to sons (see <6a> and <7b>), we have p_i[t] = j and
z_j(i)[τ] = 1. Thus the last DCL message sent by i before time t (at time t_D say) was indeed
sent to j and we want to show that it was received by j (at time τ_D say) before time τ, or in
other words i is the son of j at time τ as a result of this last DCL and not of some previous
DCL's. This is exactly the session property. The timing diagram is given in Fig. 2. Consider
also the last CNCL sent by i before t to j and let t_C, τ_C, a be respectively the time it was sent,
the time it was received and the cycle number of i at time t_C. Clearly t_C < t_D and by FIFO we
also have τ_C < τ_D. In order to prove the lemma we need to show that τ_D < τ. Observe now that
z_j(i) = 0 between τ_C and τ_D and since z_j(i)[τ] = 1, time τ cannot be between τ_C and τ_D. It
is sufficient therefore to show that τ_C < τ. Observe that <4b> shows that CNCL is sent after
receiving MSG(a) from all neighbors, in particular j, and before sending MSG(a) to j and
therefore a_j[τ_C] = a, where a_j is the cycle number of node j. Suppose now that τ_C > τ. Then
a_j[τ] ≤ a and B was sent (and received, by FIFO) from j to i before MSG(a + 1), so that i
could not have performed <4> of cycle a + 1 before t. Since p_i changes only in <4>, it
follows that p_i[t] = p_i[t_C+] ≠ j which is a contradiction. This proves the session property of
the Routing-Broadcast Protocol of Section 2.2. Observe that the proof relies heavily on the
Properties of the Routing Protocol of [2].

Lemma A2

If broadcast message B is received at node i from j and is accepted, then IC_i^B = IC_j^B.
(Recall that IC_i^B denotes the value of the counter IC_i just after node i has accepted B).

Proof

Consider the notations of Lemma A1 and of Fig. 2. From line <4b> in the CRB
algorithm follows that the DCL(a,IC) message carries the counter number IC = IC_i[t_D]. Since
p_i = j on the interval (t_D,t], node i accepts during this time broadcast messages only from j,
and by the Session Property, those are sent only after time τ_D at which j performs <6>, <6a>.
Now it is easy to check (see <6a>, <7a>-<7c> for node j) that in both cases, IC < IC_j[τ_D-]
and IC ≥ IC_j[τ_D-], node j will consecutively send to i after τ_D the broadcast messages
corresponding to counter number IC+1, IC+2, etc. When they will be received and accepted
at i, the counter IC_i will be increased respectively to IC+1, IC+2, etc.


Theorem A1

The CRB algorithm of Section 2.2 is complete, i.e. IC_i^B = IC_SINK^B holds for every node
i and every broadcast message B.

Proof

If the above relation does not hold, let i and B be the node and broadcast message for
which it is violated for the first time throughout the network, and let t be the time B was
accepted at i. If B was received from j, then Lemma A2 implies IC_j^B = IC_i^B so that
IC_j^B ≠ IC_SINK^B. But B was accepted at j before being accepted at i, violating the fact that the
statement of the Theorem held throughout the network until time t.


For  future  reference  we  need 


Lemma A3

If DCL(a,IC) arrives at node j, then a = a_j or a = a_j - 1.

Proof

Consider the notations of Lemma A1 and of Fig. 2. Then a_i[t_D] = a and therefore
MSG(a + 1) will be sent from i to j after the DCL message. Consequently <4> of cycle
(a + 1) can be performed at j only after τ_D, hence a_j[τ_D] ≤ a + 1. On the other hand, t_D is
the time i performs <4> of cycle a and hence MSG(a) has been received at i from j before or
at t_D, so that a_j[τ_D] ≥ a.


We next proceed to the proof that the RRB Protocol of Section 2.3 is complete and
finite.

Lemma A4

In the RRB Protocol, if a MSG(a',z = 0) arrives at i from j = p_i (and by <2>, <3> is
ignored), then MSG(a',z = 1) will arrive at i in finite time from j and then j will still be the
father of i.

Proof

With the notations of Fig. 1, where B is replaced by MSG(a',z = 0), holds τ < τ_D (since
z = 0) and t > t_D (since p_i = j). Now a_j[τ_D] ≥ a_j[τ] = a' = a_i[t] + 1 ≥ a_i[t_D] + 1 = a + 1,
where the second equality follows from property d) at the beginning of the Appendix. From
Lemma A3 follows that a_j[τ_D] = a + 1 and hence j will send to i at time τ_D control message
MSG(a',z = 1) according to line <6b> in RRB.

Definition

A control message MSG(a,z = 1) is said to be "accepted" at node i if it triggers execution
of <3> in RRB at node i. Also, define the counter number associated with an accepted
message MSG(a,z = 1) as IC_i^MSG(a,z=1) = (a, C_i(a) = 0).

Lemma A5

With the above definitions, control messages with z = 1 propagate in RRB as if they were
regular broadcast messages.

Proof

Broadcast messages are accepted at i only if they arrive from p_i and are sent to sons,
either when they are accepted or in response to DCL with IC < IC_i. Control messages
MSG(a,z = 1) are accepted only if they arrive from p_i and are sent to sons, either when they
are accepted (<3> in RRB) or in response to DCL with IC < IC_i (<6b> in RRB). Moreover,
MSG(a,z = 1) is accepted at i before all broadcast messages B with IC_i^B = (a,C_i(a)), since
node i enters cycle a as a result of accepting MSG(a,z = 1) from p_i, and broadcast messages
with IC_i^B as above are all accepted while i is in cycle a. Now, MSG(a,z = 1) is sent to any
node before all such broadcast messages (see <3> and <6b>), so that the order is preserved
as well. Hence the statement of the Lemma.

Corollary A1

The combination of broadcast messages and control messages with z = 1 performs a jointly
complete algorithm, i.e. all such messages are accepted in the order released by the source
node SINK, with no duplicates and no messages missing.

Theorem  A2 

The  RRB  protocol  is  complete  and  finite. 

Proof 

From Lemma A4 and the fact that every routing cycle of the algorithm of [2] propagates
in finite time, follows that the propagation of control messages with z = 1 is finite, namely every
node enters every cycle in finite time. By Corollary A1, all broadcast messages released by
SINK while SINK is in cycle a are accepted at each node while the node is in cycle a, and
since each node enters cycle (a + 1) in finite time, all such broadcast messages are accepted
at each node in finite time.

Footnotes

1. A specific line in an algorithm will be indicated in angular brackets < >. The algorithm
we refer to will be either clear from the context or indicated explicitly.

[Figure 1, panel a) CRB: timing diagram not recoverable from the source.]